AITopics | dataset metadata

Masader Plus: A New Interface for Exploring +500 Arabic NLP Datasets

Altaher, Yousef, Fadel, Ali, Alotaibi, Mazen, Alyazidi, Mazen, Al-Mutairi, Mishari, Aldhbuiub, Mutlaq, Mosaibah, Abdulrahman, Rezk, Abdelrahman, Alhendi, Abdulrazzaq, Shal, Mazen Abo, Alghamdi, Emad A., Alshaibani, Maged S., Zakraoui, Jezia, Mohammed, Wafaa, Gaanoun, Kamel, Elmadani, Khalid N., Ghaleb, Mustafa, Tazi, Nouamane, Alharbi, Raed, Masoud, Maraim, Alyafeai, Zaid

arXiv.org Artificial Intelligence, Aug-1-2022

Masader (Alyafeai et al., 2021) created a metadata structure to be used for cataloguing Arabic NLP datasets. However, developing an easy way to explore such a catalogue is a challenging task. In order to give the optimal experience for users and researchers exploring the catalogue, several design and user experience challenges must be resolved. Furthermore, user interactions with the website may provide an easy approach to improve the catalogue. In this paper, we introduce Masader Plus, a web interface for users to browse Masader. We demonstrate data exploration, filtration, and a simple API that allows users to examine datasets from the backend. Masader Plus can be explored using this link https://arbml.github.io/masader. A video recording explaining the interface can be found here https://www.youtube.com/watch?v=SEtdlSeqchk.
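As a rough illustration of the kind of filtering a catalogue interface like this supports, the following minimal sketch filters dataset metadata records held in memory. The field names (`name`, `dialect`, `tasks`, `license`), the sample records, and the `filter_records` helper are hypothetical placeholders chosen for the example; they are not the actual Masader schema or the Masader Plus API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetRecord:
    # Hypothetical fields for illustration only; not the real Masader schema.
    name: str
    dialect: str
    tasks: List[str] = field(default_factory=list)
    license: str = "unknown"


def filter_records(
    records: List[DatasetRecord],
    dialect: Optional[str] = None,
    task: Optional[str] = None,
) -> List[DatasetRecord]:
    """Return the records that match every criterion that is not None."""
    result = []
    for rec in records:
        # Case-insensitive exact match on the dialect field.
        if dialect is not None and rec.dialect.lower() != dialect.lower():
            continue
        # Case-insensitive membership test on the task list.
        if task is not None and task.lower() not in [t.lower() for t in rec.tasks]:
            continue
        result.append(rec)
    return result


# Made-up sample catalogue entries.
CATALOGUE = [
    DatasetRecord("example-corpus-a", "Modern Standard Arabic", ["machine translation"]),
    DatasetRecord("example-corpus-b", "Egyptian", ["sentiment analysis", "dialect identification"]),
    DatasetRecord("example-corpus-c", "Egyptian", ["machine translation"]),
]

if __name__ == "__main__":
    for rec in filter_records(CATALOGUE, dialect="egyptian", task="machine translation"):
        print(rec.name)
```

Passing no criteria returns the whole catalogue, so the same function can back both a "show all" view and a filtered view.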

dataset, masader plus, metadata

arXiv:2208.00932

Country:

Asia > Middle East > Bahrain (0.05)
North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

AutoML using Metadata Language Embeddings

Drori, Iddo, Liu, Lu, Nian, Yi, Koorathota, Sharath C., Li, Jie S., Moretti, Antonio Khalil, Freire, Juliana, Udell, Madeleine

arXiv.org Machine Learning, Oct-8-2019

As a human choosing a supervised learning algorithm, it is natural to begin by reading a text description of the dataset and documentation for the algorithms you might use. We demonstrate that the same idea improves the performance of automated machine learning methods. We use language embeddings from modern NLP to improve state-of-the-art AutoML systems by augmenting their recommendations with vector embeddings of datasets and of algorithms. We use these embeddings in a neural architecture to learn the distance between best-performing pipelines. The resulting (meta-)AutoML framework improves on the performance of existing AutoML frameworks. Our zero-shot AutoML system using dataset metadata embeddings provides good solutions instantaneously, running in under one second of computation. Performance is competitive with AutoML systems OBOE, AutoSklearn, AlphaD3M, and TPOT when each framework is allocated a minute of computation. We make our data, models, and code publicly available.
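The zero-shot idea in this abstract (recommend a pipeline from a dataset's text description alone) can be sketched in a few lines. This is not the authors' method: the paper uses language embeddings from modern NLP inside a neural architecture, whereas the sketch below swaps in a toy bag-of-words vector and plain cosine nearest-neighbour lookup. The `KNOWN` table of descriptions and pipeline strings, and the function names, are made up for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a language embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical (dataset description, best known pipeline) pairs.
KNOWN = [
    ("images of handwritten digits for classification",
     "scaler -> svm"),
    ("tabular records of house sale prices for regression",
     "imputer -> gradient boosting regressor"),
    ("text reviews labelled with sentiment for classification",
     "tfidf -> logistic regression"),
]


def recommend(description):
    """Return the pipeline of the known dataset whose description is most similar."""
    query = embed(description)
    best = max(KNOWN, key=lambda item: cosine(query, embed(item[0])))
    return best[1]


if __name__ == "__main__":
    print(recommend("movie reviews with positive or negative sentiment labels"))
```

Because the lookup only embeds text and compares vectors, it needs no training run on the new dataset, which is the property that makes a recommendation of this kind effectively instantaneous.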

dataset, dataset metadata, pipeline

arXiv:1910.03698

Country: North America > Canada (0.04)

Genre:

Research Report (0.64)
Overview (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.51)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)

Beyond research data infrastructures: exploiting artificial & crowd i…

#artificialintelligence, Oct-4-2019, 07:57:53 GMT

Stefan Dietze, 02/10/19

Web pages indexed by Google (plus a gazillion temporal snapshots)

Embedded markup (RDFa, Microdata, Microformats) for annotation of Web pages. Supports Web search & interpretation. Pushed by Google, Yahoo, Bing et al. (schema.org). Factual errors, annotation errors (see also [Meusel et al., ESWC2015]). Ambiguity & coreferences.

KnowMore: data fusion on markup

1. Relevance: supervised coreference resolution
2. Quality & redundancy: data fusion through supervised fact classification (SVM, kNN, RF, LR, NB), diverse feature set (authority, relevance, etc.), considering source- (e.g. PageRank), entity-, & fact-level

Rich Context & Coleridge Initiative: building (yet another) KG of scholarly resources & datasets

Context/corpus: publications (currently: social sciences, SAGE Publishing)
Tasks: I. Extraction/disambiguation of dataset mentions II.

dietze, germany, research data

Country:

Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.05)
Europe > France (0.05)
Asia > India > Bihar > Patna (0.05)

Genre: Research Report > Experimental Study (0.56)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.45)
Information Technology > Communications > Social Media > Crowdsourcing (0.41)
